answer question
King's College team wins access to cutting-edge Google quantum chip
Scientists from King's College London have become the first UK academic research team to gain access to Google's cutting-edge quantum computer chip Willow as part of a scheme launched with the UK's national quantum lab last year. Quantum computers can in theory solve problems which the most powerful conventional computers cannot. King's lead for the project Dr Eleanor Crane said its use of Willow would light a torch for research to answer questions about the most important natural processes. It would be useful if society could understand how plants transform sunlight into energy or how molecules bind to each other, or find materials which transport electricity quickly, said Crane, who will co-lead the research team alongside Dr Alexander Schuckert from ENS Paris. These natural processes rely on the interactions between many fundamental particles which make up the building blocks of life.
The Pentagon is planning for AI companies to train on classified data, defense official says
The generative AI models used in classified environments can answer questions but don't currently learn from the data they see. The Pentagon is discussing plans to set up secure environments for generative AI companies to train military-specific versions of their models on classified data, has learned. AI models like Anthropic's Claude are already used to answer questions in classified settings; applications include analyzing targets in Iran. But allowing models to train on and learn from classified data would be a new development that presents unique security risks. It would mean sensitive intelligence like surveillance reports or battlefield assessments could become embedded into the models themselves, and it would bring AI firms into closer contact with classified data than before. Training versions of AI models on classified data is expected to make them more accurate and effective in certain tasks, according to a US defense official who spoke on background with .
RECKONING: Reasoning through Dynamic Knowledge Encoding
Recent studies on transformer-based language models show that they can answer questions by reasoning over knowledge provided as part of the context (i.e., in-context reasoning). However, since the available knowledge is often not filtered for a particular question, in-context reasoning can be sensitive to distractor facts, additional content that is irrelevant to a question but that may be relevant for a different question (i.e., not necessarily random noise). In these situations, the model fails to distinguish the necessary knowledge to answer the question, leading to spurious reasoning and degraded performance. This reasoning failure contrasts with the model's apparent ability to distinguish its contextual knowledge from all the knowledge it has memorized during pre-training. Following this observation, we propose teaching the model to reason more robustly by folding the provided contextual knowledge into the model's parameters before presenting it with a question. Our method, RECKONING, is a bi-level learning algorithm that teaches language models to reason by updating their parametric knowledge through back-propagation, allowing them to answer questions using the updated parameters.
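The idea of folding contextual knowledge into parameters before answering can be illustrated with a toy sketch. This is not the RECKONING implementation: it shows only an inner loop (a few gradient steps on the provided facts) followed by answering from the updated parameters, and it omits the outer loop that a bi-level method would also train. The table-shaped model and the names `encode_knowledge` and `answer` are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode_knowledge(weights, facts, steps=20, lr=1.0):
    """Inner loop (illustrative): gradient steps that fold facts into a copy of the weights.

    weights[subject][attribute] is a logit; each fact is a (subject, attribute) pair.
    The original weights are left untouched.
    """
    updated = [row[:] for row in weights]
    for _ in range(steps):
        for subject, attribute in facts:
            probs = softmax(updated[subject])
            # For cross-entropy loss, d(loss)/d(logit_j) = probs[j] - one_hot[j].
            for j, p in enumerate(probs):
                target = 1.0 if j == attribute else 0.0
                updated[subject][j] -= lr * (p - target)
    return updated


def answer(weights, subject):
    """Answer a question about `subject` using only the parameters (no context)."""
    row = weights[subject]
    return max(range(len(row)), key=row.__getitem__)


num_subjects, num_attributes = 3, 4
base_weights = [[0.0] * num_attributes for _ in range(num_subjects)]
facts = [(0, 2), (1, 3)]  # hypothetical knowledge: subject 0 has attribute 2, etc.

adapted = encode_knowledge(base_weights, facts)
print(answer(adapted, 0), answer(adapted, 1))
```

After the inner loop, the question is answered from `adapted` alone, so the facts no longer need to appear in the input. A full bi-level method would additionally optimize the starting weights so that these inner-loop updates lead to correct answers.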
FHIR-AgentBench: Benchmarking LLM Agents for Realistic Interoperable EHR Question Answering
Lee, Gyubok, Bach, Elea, Yang, Eric, Pollard, Tom, Johnson, Alistair, Choi, Edward, Jia, Yugang, Lee, Jong Ha
The recent shift toward the Health Level Seven Fast Healthcare Interoperability Resources (HL7 FHIR) standard opens a new frontier for clinical AI, requiring LLM agents to navigate complex, resource-based data models instead of conventional structured health data. However, existing benchmarks have lagged behind this transition, lacking the realism needed to evaluate recent LLMs on interoperable clinical data. To bridge this gap, we introduce FHIR-AgentBench--a benchmark that grounds 2,931 real-world clinical questions in the HL7 FHIR standard. Using this benchmark, we systematically evaluate agentic frameworks, comparing different data retrieval strategies (direct FHIR API calls vs. specialized tools), interaction patterns (single-turn vs. multi-turn), and reasoning strategies (natural language vs. code generation). Our experiments highlight the practical challenges of retrieving data from intricate FHIR resources and the difficulty of reasoning over them--both of which critically affect question answering performance.
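For readers unfamiliar with what a "direct FHIR API call" looks like, the sketch below builds a standard FHIR REST search URL of the form `[base]/[ResourceType]?param=value`. It is not taken from the benchmark: the server address and patient identifier are hypothetical, the helper name `build_fhir_search_url` is an assumption, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_fhir_search_url(base_url, resource_type, params):
    """Build a FHIR search URL: [base]/[ResourceType]?[search parameters]."""
    query = urlencode(params)  # percent-encodes values such as "system|code"
    return f"{base_url.rstrip('/')}/{resource_type}?{query}"


url = build_fhir_search_url(
    "https://fhir.example.org/baseR4",  # hypothetical server, for illustration only
    "Observation",
    {
        "patient": "example-patient-id",    # hypothetical patient identifier
        "code": "http://loinc.org|8867-4",  # token search: system|code (LOINC heart rate)
        "_count": "10",                     # page size
    },
)
print(url)
```

An agent using this retrieval strategy would issue such requests and then reason over the returned resources, whereas a specialized tool would hide the URL construction behind a higher-level function.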
Semantic World Models
Berg, Jacob, Zhu, Chuning, Bao, Yanda, Durugkar, Ishan, Gupta, Abhishek
Planning with world models offers a powerful paradigm for robotic control. Conventional approaches train a model to predict future frames conditioned on current frames and actions, which can then be used for planning. However, the objective of predicting future pixels is often at odds with the actual planning objective; strong pixel reconstruction does not always correlate with good planning decisions. This paper posits that instead of reconstructing future frames as pixels, world models only need to predict task-relevant semantic information about the future. For such prediction the paper poses world modeling as a visual question answering problem about semantic information in future frames. This perspective allows world modeling to be approached with the same tools underlying vision language models. Thus vision language models can be trained as "semantic" world models through a supervised finetuning process on image-action-text data, enabling planning for decision-making while inheriting many of the generalization and robustness properties from the pretrained vision-language models. The paper demonstrates how such a semantic world model can be used for policy improvement on open-ended robotics tasks, leading to significant generalization improvements over typical paradigms of reconstruction-based action-conditional world modeling. Website available at https://weirdlabuw.github.io/swm.
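One way to picture planning with a question-answering world model is to score candidate action sequences by how likely the model is to answer "yes" to a task question about the future, then pick the best sequence. The sketch below is an assumption-laden toy, not the paper's method: `yes_probability` is a hypothetical stand-in for a trained model, the state is a single integer position, and the planner simply enumerates all short action sequences.

```python
from itertools import product


def yes_probability(position, actions, goal):
    """Hypothetical stand-in for a semantic world model.

    Pretends to answer "will the agent be at `goal` after these actions?"
    with a probability, using a hand-written rule instead of a trained model.
    """
    final_position = position + sum(actions)
    return 1.0 / (1.0 + abs(final_position - goal))


def plan(position, goal, horizon=3, action_set=(-1, 0, 1)):
    """Enumerate action sequences and return the one the model scores highest."""
    candidates = product(action_set, repeat=horizon)
    return max(candidates, key=lambda seq: yes_probability(position, seq, goal))


best = plan(position=0, goal=2)
print(best, sum(best))
```

The planner never predicts pixels: it only needs the answer to a task-relevant question about the future, which is the property the abstract emphasizes.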
Can Large Language Models Bridge the Gap in Environmental Knowledge?
Smail, Linda, Calonge, David Santandreu, Kamalov, Firuz, Orak, Nur H.
The investigation employs a standardized tool, the Environmental Knowledge Test (EKT-19), supplemented by targeted questions, to evaluate the environmental knowledge of university students in comparison to the responses generated by the AI models. The results of this study suggest that while AI models possess a vast, readily accessible, and valid knowledge base with the potential to empower both students and academic staff, a human discipline specialist in environmental sciences may still be necessary to validate the accuracy of the information provided.
Keywords: Environmental Education; AI Models; EKT-19
1. Introduction
Extreme weather events, increasing global temperatures, rising sea-levels, and changes to ecosystems and biodiversity are all consequences of climate change, which is mostly caused by anthropogenic greenhouse gas emissions (Masson-Delmotte et al., 2018). Meanwhile, the loss of biodiversity due to habitat degradation, pollution, overexploitation, and invasive species threatens the resilience of society's ecosystems (Nature, 2021). These consequences pose questions regarding food security, public health, and socioeconomic stability. Thus, effective access to accurate environmental knowledge is crucial for developing sustainable solutions and informed environmental policies.
What is Grok and why has Elon Musk's chatbot been accused of anti-Semitism?
Elon Musk's artificial intelligence company xAI has come under fire after its chatbot Grok stirred controversy with anti-Semitic responses to questions posed by users – just weeks after Musk said he would rebuild it because he felt it was too politically correct. On Friday last week, Musk announced that xAI had made significant improvements to Grok, promising a major upgrade "within a few days". Online tech news site The Verge reported that, by Sunday evening, xAI had already added new lines to Grok's publicly posted system prompts. By Tuesday, Grok had drawn widespread backlash after generating inflammatory responses – including anti-Semitic comments. One Grok user asking the question, "which 20th-century figure would be best suited to deal with this problem (anti-white hate)", received the anti-Semitic response: "To deal with anti-white hate? Here's what we know about the Grok chatbot and the controversies it has caused. Grok, a chatbot created by xAI – the AI company Elon Musk ...